TEI and LMF crosswalks
Abstract
The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Mark-up Framework). It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of informational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the ongoing revision of LMF. Since this paper adopts the specific viewpoint of the TEI guidelines, no precise description of LMF is given here. For an introduction to LMF, see section 4 of (ROMARY 2013).

1 Towards a more intimate relationship between the TEI and the LMF standards

This chapter is about a simple thesis: the TEI framework could be the optimal serialisation background for the LMF standard, since it provides both an ideal XML specification platform and a representation vocabulary that can be easily tuned (or customised) to cover the various LMF packages and components. This thesis does not come out of the blue. It arises naturally when one observes the history of both initiatives and their current impact in various communities in the humanities and in computational linguistics. It also arises when one considers how relevant an LMF-specific serialisation can be, given that lexical data are in essence meant to be interconnected with various other types of linguistic resources. As a matter of fact, the current XML serialisation of LMF suffers from both generic and specific problems that have prevented it from being widely used by the various communities interested in digital lexical resources.
Right from the outset, the lack of consensus on a strategy for defining a reliable and stable XML serialisation forced the ISO working group on LMF to confine it to an informative annex, with the following main shortcomings:
- Being carved in stone within the ISO standard, rather than being pointed to as an external and stable online resource, prevents it from being properly maintained, whether to correct identified weak points or bugs or to add features;
- It is only defined as a DTD, a vestigial XML schema language that hardly any XML developer still uses and which deeply limits its capacity to express constraints on types or to factorise global attributes. For the sake of simplicity (and this can be easily understood when one has to finalise a text for an ISO standard) no parallel definition as a RelaxNG or W3C XML Schema was provided;
- It does not reflect the intrinsic extensibility of LMF, as it does not contain any dedicated mechanism for customisation, for instance when the developer of a new lexical model would like to discard some packages or add her own extensions;
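To make the thesis of this section more concrete, the following minimal sketch builds a small lexical entry with elements from the Dictionaries chapter of the TEI guidelines (`entry`, `form`, `orth`, `gramGrp`, `pos`, `sense`, `def`). The pairing suggested in the comments between LMF components and TEI elements is an illustrative assumption, not the crosswalk defined by the paper, and the helper names `tei` and `build_entry` as well as the sample word are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialise TEI elements in the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element (hypothetical helper)."""
    return f"{{{TEI_NS}}}{tag}"


def build_entry(lemma: str, pos: str, definitions: list[str], lang: str) -> ET.Element:
    """Build a minimal TEI <entry>.

    Assumed, illustrative correspondence with LMF components:
    a lexical entry -> <entry>, its lemma -> <form type="lemma">,
    each sense -> <sense>.
    """
    entry = ET.Element(tei("entry"))
    entry.set(f"{{{XML_NS}}}lang", lang)

    # Lemma: written form of the headword.
    form = ET.SubElement(entry, tei("form"), {"type": "lemma"})
    ET.SubElement(form, tei("orth")).text = lemma

    # Grammatical information attached to the whole entry.
    gram = ET.SubElement(entry, tei("gramGrp"))
    ET.SubElement(gram, tei("pos")).text = pos

    # One <sense> per definition, numbered from 1.
    for number, text in enumerate(definitions, start=1):
        sense = ET.SubElement(entry, tei("sense"), {"n": str(number)})
        ET.SubElement(sense, tei("def")).text = text

    return entry


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = build_entry("chat", "noun", ["domestic cat"], "fr")
    print(ET.tostring(sample, encoding="unicode"))
```

Because the entry is plain TEI, it can sit inside any TEI document next to other kinds of encoded resources, which is the interconnection argument made above. Project-specific constraints would be expressed in a TEI customisation rather than in this code.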
Similar resources
Semantisierung des Textes im Lichte und im Schatten der Text Encoding Initiative (TEI)
Textual markup is virtually without alternative as an approach to making cultural heritage texts semantically accessible. The strength of markup languages, which treat their object simultaneously as a sequential, a hierarchical and a network-like data structure, is particularly well suited to the complex rendering and analysis requirements of historical texts. Local ...
A prototype for projecting HPSG syntactic lexica towards LMF
The comparative evaluation of Arabic HPSG grammar lexica requires a deep study of their linguistic coverage. The complexity of this task results mainly from the heterogeneity of the descriptive components within those lexica (underlying linguistic resources and different data categories, for example). It is therefore essential to define more homogeneous representations, which in turn will enabl...
Accessing and Standardizing Wiktionary Lexical Entries for Supporting the Translation of Labels in Taxonomies for Digital Humanities
We describe the usefulness of Wiktionary, the freely available web-based lexical resource, in providing multilingual extensions to catalogues that serve content-based indexing of folktales and related narratives. We develop conversion tools between Wiktionary and TEI, using ISO standards (LMF, MAF), to make such resources available to both the Digital Humanities community and the Language Resou...
A Combined Fuzzy Logic and Analytical Hierarchy Process Method for Optimal Selection and Locating of Pedestrian Crosswalks
One of the main challenges for transportation engineers is the consideration of pedestrian safety as the most vulnerable aspect of the transport system. In many countries around the world, a large number of accidents recorded by the police are composed of accidents involving pedestrians and vehicles, for example when pedestrians may be struck by passing vehicles when crossing the street. Carefu...
Multiwavelength-integrated local model fitting method for interferometric surface profiling.
The local model fitting (LMF) method is a useful single-shot surface profiling algorithm that features fast measurement speed and robustness against vibration. However, the measurement range of the LMF method (i.e., measurable height difference between two neighboring pixels) is limited up to a quarter of the light source wavelength. To cope with this problem, the multiwavelength-matched LMF(MM...
Journal title: JLCL
Volume: 30
Publication date: 2015